Non-stationary signal processing and its application in speech recognition
Authors
Abstract
The most widely used acoustic feature extraction methods in current automatic speech recognition (ASR) systems are based on the assumption of stationarity. In this paper, we extensively evaluate a recently introduced filter-stable, non-stationary signal processing method, which relies on an adaptive part-tone decomposition of voiced speech to obtain alternative feature vectors for ASR. The non-stationary filterbank allows for more noise-robust amplitude-based features by suppressing the between-harmonics regions. Furthermore, by adapting the filter center frequencies to the underlying acoustic modes, it is possible to obtain useful phase features, which can be interpreted in terms of the non-stationary dynamics within the vocal tract. The features are evaluated on different tasks ranging from vowel classification to large-vocabulary continuous speech recognition.
Similar resources
Application of Signal Processing Tools for Fault Diagnosis in Induction Motors-A Review-Part II
The use of efficient signal processing tools (SPTs) to extract proper indices for fault detection in induction motors (IMs) is an essential part of any fault recognition procedure. The second part of this two-part paper is, in turn, divided into two parts. Part two covers the signal processing techniques which can be applied to non-stationary conditions. In this paper, all utilized SPTs for n...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Gender recognition and age detection are important problems in telephone speech processing, where voice characteristics are used to investigate the identity of an individual. In this paper, a new gender and age recognition system is introduced, based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. Similar to genera...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also applied in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Synchrosqueezing-based Transform and its Application in Seismic Data Analysis
Seismic waves are non-stationary due to their propagation through the earth. Time-frequency transforms are suitable tools for analyzing non-stationary seismic signals. Spectral decomposition can reveal non-stationary characteristics which cannot be easily observed in the time or frequency representation alone. Various types of spectral decomposition methods have been introduced by some resear...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...